Comparative Study of the Baum-Welch and Viterbi Training Algorithms Applied to Read and Spontaneous Speech Recognition
Abstract
In this paper we compare the performance of acoustic HMMs obtained through Viterbi training with that of acoustic HMMs obtained through the Baum-Welch algorithm. We present recognition results for discrete and continuous HMMs, for read and spontaneous speech databases, acquired at 8 and 16 kHz. We also present results for a combination of Viterbi and Baum-Welch training, intended as a trade-off solution. Though Viterbi training yields good performance in most cases, it sometimes leads to suboptimal models, especially when discrete HMMs are used to model spontaneous speech. In these cases, Baum-Welch proves more robust than both Viterbi training and the combined approach, which compensates for its high computational cost. The proposed combination of Viterbi and Baum-Welch outperforms Viterbi training only in the case of read speech at 8 kHz. Finally, when using continuous HMMs, Viterbi training proves as good as Baum-Welch at a much lower cost.
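The difference between the two training schemes compared above can be illustrated with a minimal sketch. It is not the paper's implementation: the two-state discrete HMM, its parameter values, and the function names `viterbi_path`, `state_posteriors`, and `emission_counts` are all assumptions made for illustration. Viterbi training credits each observation to the single state on the best path (hard counts), whereas Baum-Welch spreads each observation over all states in proportion to their posterior probability (soft counts).

```python
# Toy discrete HMM; every number here is an illustrative assumption.
PI = [0.6, 0.4]                      # initial state probabilities
A = [[0.7, 0.3], [0.4, 0.6]]         # A[i][j] = P(next state j | state i)
B = [[0.9, 0.1], [0.2, 0.8]]         # B[i][k] = P(symbol k | state i)


def viterbi_path(obs):
    """Most likely state sequence (the hard alignment used by Viterbi training)."""
    n = len(PI)
    delta = [PI[i] * B[i][obs[0]] for i in range(n)]
    backptr = []
    for o in obs[1:]:
        prev = delta
        # best[j] = predecessor state that maximizes the score of reaching j
        best = [max(range(n), key=lambda i, j=j: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[best[j]] * A[best[j]][j] * B[j][o] for j in range(n)]
        backptr.append(best)
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for best in reversed(backptr):
        state = best[state]
        path.append(state)
    return path[::-1]


def state_posteriors(obs):
    """gamma[t][i] = P(state i at time t | obs), via unscaled forward-backward.

    No scaling is applied, so this is only suitable for short sequences.
    """
    n, T = len(PI), len(obs)
    alpha = [[PI[i] * B[i][obs[0]] for i in range(n)]]
    for t in range(1, T):
        alpha.append([sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                      for j in range(n)])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   for i in range(n)]
    likelihood = sum(alpha[-1])
    return [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)] for t in range(T)]


def emission_counts(obs, n_symbols=2):
    """Return (hard, soft) emission count tables, one row per state."""
    n = len(PI)
    hard = [[0.0] * n_symbols for _ in range(n)]
    soft = [[0.0] * n_symbols for _ in range(n)]
    path = viterbi_path(obs)
    gamma = state_posteriors(obs)
    for t, o in enumerate(obs):
        hard[path[t]][o] += 1.0          # Viterbi training: one state gets full credit
        for i in range(n):
            soft[i][o] += gamma[t][i]    # Baum-Welch: credit shared by posterior
    return hard, soft
```

Normalizing each row of either table gives a re-estimated emission distribution for that state. The hard counts need only one best-path pass per utterance, which is consistent with the lower cost the abstract attributes to Viterbi training, while the soft counts require the full forward-backward recursion.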
Similar resources
Generalized Baum-Welch and Viterbi Algorithms Based on the Direct Dependency among Observations
The parameters of a Hidden Markov Model (HMM) are transition and emission probabilities. Both can be estimated using the Baum-Welch algorithm. The process of discovering the sequence of hidden states, given the sequence of observations, is performed by the Viterbi algorithm. In both Baum-Welch and Viterbi algorithms, it is assumed that...
Baum-Welch Training for Segment-Based Speech Recognition
The use of segment-based features and segmentation networks in a segment-based speech recognizer complicates the probabilistic modeling because it alters the sample space of all possible segmentation paths and the feature observation space. This paper describes a novel Baum-Welch training algorithm for segment-based speech recognition which addresses these issues by an innovative use of finite-...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
On Hand Gestures Recognition Using Hidden Markov Models
In this paper several results concerning static hand gesture recognition using an algorithm based on left-right Hidden Markov Models (HMM) are presented. The features used as observables in the training as well as in the recognition phases are based either on the 2D Discrete Cosine Transform (DCT) or on the Principal Component Analysis (PCA). The left-right topology of the HMM together with the...
A comparative study on maximum entropy and discriminative training for acoustic modeling in automatic speech recognition
While Maximum Entropy (ME) based learning procedures have been successfully applied to text-based natural language processing, there has been little investigation of ME for acoustic modeling in automatic speech recognition. In this paper we show that the well-known Generalized Iterative Scaling (GIS) algorithm can be used as an alternative method to discriminatively train the parameters ...